Automatic Garbage Collection and Finalization

Past versions of TADS required the game program to manage dynamically created objects; in particular, the program had to keep track of objects and explicitly delete them when they were no longer needed. TADS 3 eliminates this requirement by providing automatic garbage collection. The T3 VM automatically keeps track of which objects can still be used and which have become inaccessible, and from time to time deletes the inaccessible objects, making the memory they were using available for re-use. (We refer to inaccessible objects as "garbage," because they're just taking up memory without being of any further use to the program, and we refer to the process of recognizing and deleting these inaccessible objects as "garbage collection.")

For the most part, the garbage collector is invisible, so you can ignore it when writing your program. However, in some cases you might wish to be notified when the garbage collector is about to delete one of your objects. When you want such notification, you can use a "finalizer."

A finalizer is a special method, whose name is always finalize(); this method takes no arguments. When the garbage collector determines that an object has become unreachable, it checks to see if the object has a finalize() method. If the object does not have a finalize() method, the garbage collector can simply delete the object at any subsequent time. If the object does have a finalize() method, the garbage collector marks the object as "finalizable." Once an object is marked finalizable, the garbage collector can call the object's finalize() method at any subsequent time. Once this method returns, the garbage collector marks the object as "finalized." Once the object is marked as finalized, the garbage collector re-considers the object's reachability; at any subsequent time that the garbage collector determines that the object is unreachable, the collector can delete the object.

Note that the garbage collector must determine that an object with a finalize() method is unreachable twice before it can actually delete the object: the object must become unreachable once before the finalize() method can be called, and then must either remain unreachable or once again become unreachable before it can be deleted. The reason for the second reachability check is that the finalize() method could potentially make the object reachable again. Consider this example:

    MyGlobals: object

        finalizedList = []

    class MyClass: object

        finalize()

            MyGlobals.finalizedList += self;

When an instance of MyClass becomes unreachable, the garbage collector will at some point call the instance's finalize() method, which adds a reference to the instance to MyGlobals.finalizedList. Since MyGlobals is a named object, it's always reachable, hence anything that MyGlobals.finalizedList refers to is reachable – this means that the instance being finalized once again becomes reachable. So, the garbage collector cannot actually delete this object until the references is removed from MyGlobals.finalizedList, at which point the instance once again becomes unreachable (assuming it hasn't been referenced anywhere else in the meantime).

The garbage collector calls an object's finalizer only once, even if the object becomes reachable again while it's being finalized. The single finalizer call is enforced by the state transitions: an object is initially unfinalized; after the garbage collector first notices that it is unreachable it becomes finalizable; after the garbage collector calls the finalizer the object becomes finalized. Once an object is finalized, it is deleted as soon as the collector notices that the object is unreachable. The garbage collector can only call the finalizer on a finalizable object, and once the finalizer is called the object becomes finalized; it cannot return to the finalizable state from the finalized state.

Note that the garbage collector does not run continuously, but only at certain times; exactly when the collector will run is unpredictable, because it depends on what memory operations the program performs, but it's also not usually important, since the program can largely ignore the collector's operation. Because of the unpredictable timing of garbage collection, the timeline descriptions above are intentionally a little vague; the only thing that's certain is the order of events, not their exact timing. So, an object might be finalized very quickly after it becomes unreachable, or it might sit in memory for a long time before the garbage collector gets around to finalizing the object.

Note also that you can explicitly invoke the garbage collector with the t3RunGC() function in the "t3vm" function set.

Implementation Details

For those interested in academic details, the T3 VM implementation in TADS 3 uses a synchronous "tracing" garbage collector.

A tracing collector traverses the entire set of accessible objects, starting with the "root set." The root set is the set of objects that are directly reachable to the program, such as local variables and static objects defined in the source code. The garbage collector marks each root set object as reachable, then marks as reachable each object to which a root set object refers, then marks as reachable each object to which those objects refer, and so on. This process continues until the collector has marked every object that can be reached directly or indirectly through references from root set objects. Any objects not marked during this tracing process are unreachable, and hence can be deleted.

The TADS 3 collector is synchronous, which means that the VM must suspend its other operations to run the garbage collector. However, on systems that support background threads, the TADS 3 implementation can run garbage collection in the background while certain types of user interface activities, such as reading a command line, are taking place; these operations have no interaction with the memory manager, hence they can proceed simultaneously with garbage collection. (Non-blocking garbage collectors have become rather trendy in recent programming language systems, but a blocking collector seemed appropriate for TADS 3. Non-blocking collectors add considerable complexity to the entire memory management system, and the experience of most system designers has been that synchronous collectors usually have higher overall throughput, perhaps owing to their greater simplicity. The primary advantage of non-blocking collectors is that they tend to spread out their work more evenly in time, making them more suitable for real-time systems; this didn't seem a strong enough justification for TADS 3.)